 cryptic clue




A Reasoning-Based Approach to Cryptic Crossword Clue Solving

Andrews, Martin, Witteveen, Sam

arXiv.org Artificial Intelligence

Cryptic crossword clues are challenging language tasks for which new test sets are released daily by major newspapers on a global basis. Each cryptic clue contains both the definition of the answer to be placed in the crossword grid (in common with regular crosswords), and 'wordplay' that proves that the answer is correct (i.e. a human solver can be confident that an answer is correct without needing crossing words as confirmation). This work describes an LLM-based reasoning system built from open-licensed components that solves cryptic clues by (i) hypothesising answers; (ii) proposing wordplay explanations; and (iii) using a verifier system that operates on codified reasoning steps. Overall, this system establishes a new state-of-the-art performance on the challenging Cryptonite dataset of clues from The Times and The Telegraph newspapers in the UK. Because each proved solution is expressed in Python, interpretable wordplay reasoning for proven answers is available for inspection.
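As a rough illustration of what "codified reasoning steps" checked by a verifier might look like, the sketch below verifies a single anagram-type wordplay explanation in Python. It is not the authors' verifier: the function names (`normalise`, `is_anagram`, `matches_enumeration`, `verify_anagram_clue`) and the example clue are hypothetical, and a real system would need to cover many more wordplay types.

```python
def normalise(text: str) -> str:
    """Keep letters only and upper-case them, so spacing and punctuation are ignored."""
    return "".join(ch for ch in text.upper() if ch.isalpha())


def is_anagram(fodder: str, answer: str) -> bool:
    """True if the answer uses exactly the letters of the fodder."""
    return sorted(normalise(fodder)) == sorted(normalise(answer))


def matches_enumeration(answer: str, enumeration: tuple[int, ...]) -> bool:
    """True if the word lengths of the answer match the clue's enumeration, e.g. (6,)."""
    return tuple(len(normalise(word)) for word in answer.split()) == enumeration


def verify_anagram_clue(answer: str, fodder: str, enumeration: tuple[int, ...]) -> bool:
    """A hypothesised answer is 'proved' only if every codified check passes."""
    return matches_enumeration(answer, enumeration) and is_anagram(fodder, answer)


# Made-up clue for illustration: "Quiet, as 'listen' is rearranged (6)"
# Hypothesised answer: SILENT; proposed wordplay: anagram of "listen".
if __name__ == "__main__":
    print(verify_anagram_clue("SILENT", "listen", (6,)))
```

Because the explanation is ordinary Python, a failed check points at the step that does not hold (wrong length or wrong letters), which is the kind of inspectability the abstract describes.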


What Makes Cryptic Crosswords Challenging for LLMs?

Sadallah, Abdelrahman, Kotova, Daria, Kochmar, Ekaterina

arXiv.org Artificial Intelligence

Cryptic crosswords are puzzles that rely on general knowledge and the solver's ability to manipulate language on different levels, dealing with various types of wordplay. Previous research suggests that solving such puzzles is challenging even for modern NLP models, including Large Language Models (LLMs). However, there is little to no research on the reasons for their poor performance on this task. In this paper, we establish the benchmark results for three popular LLMs: Gemma2, LLaMA3 and ChatGPT, showing that their performance on this task is still significantly below that of humans. We also investigate why these models struggle to achieve superior performance. We release our code and introduced datasets at https://github.com/bodasadallah/decrypting-crosswords.


Are LLMs Good Cryptic Crossword Solvers?

Sadallah, Abdelrahman "Boda", Kotova, Daria, Kochmar, Ekaterina

arXiv.org Artificial Intelligence

Cryptic crosswords are puzzles that rely not only on general knowledge but also on the solver's ability to manipulate language on different levels and deal with various types of wordplay. Previous research suggests that solving such puzzles is a challenge even for modern NLP models. However, the abilities of large language models (LLMs) have not yet been tested on this task. In this paper, we establish the benchmark results for three popular LLMs -- LLaMA2, Mistral, and ChatGPT -- showing that their performance on this task is still far from that of humans.


ChatGPT: can artificial intelligence create crosswords?

The Guardian

First, if you're a solver of the Mephisto series – which is unusual in giving the actual names of its setters – and have wondered what Paul McKenna does when he's not setting, you can now find out. The same setter is the Financial Times' Jason, and that paper interviews him as part of "an occasional series": Did your school mention crossword compiling in career discussions? It was never mentioned as a career option. I am a construction manager in the oil and gas pipeline industry. It is still a rare event for us to welcome a new compiler to the series.


Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP

Rozner, Josh, Potts, Christopher, Mahowald, Kyle

arXiv.org Artificial Intelligence

Cryptic crosswords, the dominant English-language crossword variety in the United Kingdom, can be solved by expert humans using flexible, creative intelligence and knowledge of language. Cryptic clues read like fluent natural language, but they are adversarially composed of two parts: a definition and a wordplay cipher requiring sub-word or character-level manipulations. As such, they are a promising target for evaluating and advancing NLP systems that seek to process language in more creative, human-like ways. We present a dataset of cryptic crossword clues from a major newspaper that can be used as a benchmark and train a sequence-to-sequence model to solve them. We also develop related benchmarks that can guide development of approaches to this challenging task. We show that performance can be substantially improved using a novel curriculum learning approach in which the model is pre-trained on related tasks involving, e.g., unscrambling words, before it is trained to solve cryptics. However, even this curricular approach does not generalize to novel clue types in the way that humans can, and so cryptic crosswords remain a challenge for NLP systems and a potential source of future innovation.
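The unscrambling pre-training task mentioned above can be pictured as generating (scrambled, original) pairs for a sequence-to-sequence model. The following is a minimal sketch under assumptions: the helper names, the word list, and the pair format are hypothetical and are not taken from the paper.

```python
import random


def make_unscramble_example(word: str, rng: random.Random) -> tuple[str, str]:
    """Return a (source, target) pair: the shuffled letters and the original word."""
    letters = list(word)
    rng.shuffle(letters)
    return "".join(letters), word


def make_curriculum(words: list[str], seed: int = 0) -> list[tuple[str, str]]:
    """Build one unscrambling example per word, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [make_unscramble_example(word, rng) for word in words]


if __name__ == "__main__":
    # Hypothetical word list; a real curriculum would draw on a large lexicon.
    for source, target in make_curriculum(["silent", "puzzle", "cipher"]):
        print(f"{source} -> {target}")
```

A model would first be trained on pairs like these and only afterwards on full cryptic clues, which is the ordering the curriculum idea relies on.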


Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language

Efrat, Avia, Shaham, Uri, Kilman, Dan, Levy, Omer

arXiv.org Artificial Intelligence

Current NLP datasets targeting ambiguity can be solved by a native speaker with relative ease. We present Cryptonite, a large-scale dataset based on cryptic crosswords, which is both linguistically complex and naturally sourced. Each example in Cryptonite is a cryptic clue, a short phrase or sentence with a misleading surface reading, whose solving requires disambiguating semantic, syntactic, and phonetic wordplays, as well as world knowledge. Cryptic clues pose a challenge even for experienced solvers, though top-tier experts can solve them with almost 100% accuracy. Cryptonite is a challenging task for current models; fine-tuning T5-Large on 470k cryptic clues achieves only 7.6% accuracy, on par with the accuracy of a rule-based clue solver (8.6%).